Recurrent Pseudo Relevance Feedback on Web Collections
Abstract
Various Relevance Feedback techniques exist in Information Retrieval, such as Simulated Relevance Feedback and Pseudo Relevance Feedback. In Simulated Relevance Feedback, a new query is formulated from the documents that the user selects among the top-ranked documents, whereas in Pseudo Relevance Feedback the query is reformulated on the assumption that the N top-ranked documents are relevant. This paper develops a new Relevance Feedback technique that achieves higher precision than Pseudo Relevance Feedback at low recall levels. The new technique, called Recurrent Pseudo Relevance Feedback (RPRF), repeats the feedback process: the top-ranked documents retrieved by the Pseudo Relevance Feedback pass are treated as the new set of top-ranked documents for a further pass. RPRF shows greater retrieval effectiveness under the probabilistic DB2 model of the Divergence from Randomness (DFR) framework. It is also shown that retrieval effectiveness increases as the value of N decreases in the second feedback pass. Effectiveness is assessed from recall and precision ratios using the Interpolated Mean Average Precision (MAP), and the technique was tested on the Text Retrieval Conference (TREC) Web collections WT10G and GOV, of 10 and 18 gigabytes respectively.
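The repeated feedback loop described in the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: a toy weighted term-frequency scorer stands in for the DFR weighting model, and the function names, the 0.5 expansion weight, and the parameters `n_per_pass` and `n_terms` are all hypothetical. The default uses a smaller N on the second pass, in line with the abstract's observation.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems stem and remove stopwords)."""
    return text.lower().split()


def rank(query_weights, docs):
    """Rank document ids by a toy weighted term-frequency score.

    This scorer is only a stand-in for a real weighting model such as DFR.
    """
    scores = {}
    for doc_id, text in docs.items():
        tf = Counter(tokenize(text))
        scores[doc_id] = sum(w * tf[t] for t, w in query_weights.items())
    # Highest score first; ties broken by document id so the order is deterministic.
    return sorted(docs, key=lambda d: (-scores[d], d))


def expand(query_weights, docs, feedback_ids, n_terms):
    """Boost the n_terms most frequent terms of the assumed-relevant documents."""
    pooled = Counter()
    for doc_id in feedback_ids:
        pooled.update(tokenize(docs[doc_id]))
    expanded = dict(query_weights)
    for term, _ in pooled.most_common(n_terms):
        # 0.5 is an arbitrary illustrative expansion weight.
        expanded[term] = expanded.get(term, 0.0) + 0.5
    return expanded


def recurrent_prf(query, docs, n_per_pass=(3, 2), n_terms=3):
    """Run one feedback pass per entry of n_per_pass.

    Each pass assumes the top-N documents of the previous ranking are relevant,
    expands the query from them, and re-ranks the collection.
    """
    weights = {t: 1.0 for t in tokenize(query)}
    ranking = rank(weights, docs)
    for n in n_per_pass:
        weights = expand(weights, docs, ranking[:n], n_terms)
        ranking = rank(weights, docs)
    return ranking, weights
```

Calling `recurrent_prf("web search", docs)` with `docs` as a dictionary from document id to text returns the final ranking and the expanded query weights. Passing a single-element `n_per_pass` reduces the sketch to ordinary Pseudo Relevance Feedback, which makes the two easy to compare on the same collection.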
Similar resources
Extract-biased pseudo-relevance feedback
Successfully retrieving a web document is a twofold problem: having an adequate query that can usefully and properly help filter relevant documents from huge collections, and presenting the user with those that will indeed fulfill his/her needs. In this paper, we focus on the first issue – the problem of having a misleading user query. The aim of the work is to refine a query by using extracts in...
Flexible Pseudo-Relevance Feedback via Direct Mapping and Categorization of Search Requests
This paper explores various strategies for enhancing the reliability of pseudo-relevance feedback using TREC and NTCIR test collections. For each test request, the number of pseudo-relevant documents or the number of expansion terms is determined based on a similar training request (i.e. via direct mapping) or a group of similar training requests (i.e. via categorization). The results ...
Database Selection and Result Merging in P2P Web Search
Intelligent Web search engines are extremely popular now. Currently, only the commercial centralized search engines like Google can process terabytes of Web data. Alternative search engines fulfilling collaborative Web search on a voluntary basis are usually based on a blooming Peer-to-Peer (P2P) technology. In this paper, we investigate the effectiveness of different database selection and res...
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users. Constructing an adequate query which represents the best specification of users’ information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
Learning to expand queries using entities
A substantial fraction of web search queries contain references to entities, such as persons, organizations, and locations. Recently, methods that exploit named entities have been shown to be more effective for query expansion than traditional pseudo-relevance feedback methods. In this paper, we introduce a supervised learning approach that exploits named entities for query expansion, using Wik...
Publication date: 2012